Life expectancy is a measure of how long an individual is expected to live on average and is commonly used in designing policy or even as a social indicator to evaluate the quality of life for any given region (see “An Overarching Health Indicator for the Post-2015 Development Agenda” 2014; also “An Overarching Health Indicator for the Post-2015 Development Agenda,” n.d.).
The goal of this project is to develop a model that predicts life expectancy in Baltimore at single-block resolution, with estimates of uncertainty. With this information we hope to better examine which factors contribute to life expectancy for any given block in any given neighborhood in Baltimore City, and so aid decision making when policy changes are being implemented.
We have data obtained from the city of Baltimore that give estimates of life expectancy at the Community Statistical Area (CSA) level. Estimates are reported at this level because the boundaries and names of the 270+ neighborhoods in Baltimore may change over time, so the CSA provides a consistent way to characterize a particular region of the city. Each CSA is made up of several neighborhoods, and a neighborhood may belong to more than one CSA; that is, the boundary of a CSA may cut through a neighborhood.
Because our outcome, life expectancy, is available only at an aggregate level, this project aims to predict life expectancy at the street block level using statistical downscaling methods.
We have data from the Baltimore City website, the Baltimore Neighborhood Indicators Alliance (BNIA-JF), the Maryland Department of Planning, and the Census Bureau. The data consist of life expectancy estimates for each neighborhood, along with crime, economic development, and education information, all over a five-year period (2010-2014). I also have street-level and block-group-level data.
The data fall in three general categories.
The following table lists some of the variables used in the model fitting process, the level at which we originally obtained each one, and the assumptions we made to bring it to the street block level:
| Variable | Description | Original level | Cleaning steps and assumptions |
|---|---|---|---|
| propfemhh | Proportion of households headed by a female with related children under 18 years | Block group | Because the data were at the block group level and we needed street block values, we used Kriging to interpolate values at new locations (the street block locations) from the block group data. The location of each street block was taken as the median longitude and latitude of all streets that make up the street block. One assumption made here is that the distribution of the variable (propfemhh) is smooth, in the sense that nearby street blocks tend to have similar proportions of such households. |
| propkids_withinsurance | Proportion of individuals under 18 years who have health insurance in a given block group | Block group | We used the block group value as the value for each street block in that block group. The assumption here is that block groups tend to be quite homogeneous with regard to this variable. |
| racdiv | Racial diversity as calculated per block group | Block group | This variable was not given directly but was estimated from the block group data on race. It is computed as follows: calculate each group's share of the population as a proportion, square each share, sum the squares, and subtract the sum from 1.00. Eight groups were used for the index: White, not Hispanic; Black or African American; American Indian and Alaska Native (AIAN); Asian; Native Hawaiian and Other Pacific Islander (NHOPI); two or more races, not Hispanic; some other race, not Hispanic; Hispanic or Latino. This method is based on that used by the Census Bureau. More information can be found here. We decided not to interpolate these values for the street blocks and instead used the value of the block group each street block belongs to. This was done because of the unique structure of neighborhoods in Baltimore City. |
| propbelow | Proportion of individuals within a block group who live below the poverty line | Block group | To get the data at the street block level, we interpolated values from the block group level. The assumption used here is that the further into a particular neighborhood you go, the more representative each block is of the aggregate level data for this variable. |
| mhhi | Median household income | Block group | We interpolated the values for the street blocks from the block group level data using Kriging. Again this is based on the assumption that the further into a particular neighborhood you go, the more representative each block is of the aggregate level data for this variable. |
| totalincidents | Number of crime incidents per street | Street | We aggregated the street-level counts to get the number of crimes committed per street block. |
| prop.vacant | Proportion of vacant homes | Street | We divided the number of vacant homes per street block by the total number of homes in that street block. |
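Three of the simpler cleaning steps in the table (the diversity index, the street block location, and the vacancy proportion) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the analysis: the function names, the input layouts (a dictionary of group counts, a list of `(longitude, latitude)` pairs), and the example numbers are all hypothetical.

```python
from statistics import median


def diversity_index(group_counts):
    """Return 1 - sum(p_i^2), where p_i is each group's share of the population.

    `group_counts` is a hypothetical layout: a dict mapping each group
    label to its population count within one block group.
    """
    total = sum(group_counts.values())
    if total == 0:
        return None  # no residents, so the index is undefined
    return 1.0 - sum((n / total) ** 2 for n in group_counts.values())


def street_block_location(street_coords):
    """Median longitude and latitude of the streets making up a street block.

    `street_coords` is a hypothetical list of (longitude, latitude) pairs,
    one pair per street in the block.
    """
    lons = [lon for lon, _ in street_coords]
    lats = [lat for _, lat in street_coords]
    return median(lons), median(lats)


def proportion_vacant(vacant_homes, total_homes):
    """Vacant homes divided by all homes in one street block."""
    if total_homes == 0:
        return None  # no homes recorded, so the proportion is undefined
    return vacant_homes / total_homes


# Made-up example values, for illustration only.
counts = {"white_nh": 120, "black": 300, "asian": 30, "hispanic": 50}
print(diversity_index(counts))
print(street_block_location([(-76.61, 39.29), (-76.60, 39.30), (-76.62, 39.31)]))
print(proportion_vacant(3, 12))
```

The index is 0 when a block group contains a single group and approaches 1 as the population is spread evenly over many groups; with eight equally sized groups it equals 1 - 8 × (1/8)² = 0.875.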
The remaining variables used in the final model are: Percentage of Students Suspended or Expelled During School Year (susp); Liquor Outlet Density per 1,000 Residents (liquor); and Percent of Residences Heated by Electricity (elheat). Note that these three variables were observed at the CSA level. I did not interpolate them to the street block level, as I felt that the assumptions inherent in the process would be untenable.
Since the goal of this analysis is to predict life expectancy at the street block level, and since the block information contained in the dataset was not properly defined, I made a couple of plots to see which units were census blocks and which were street blocks.
Furthermore, since some of the data files have information on neighborhood blocks, I plotted the neighborhoods as delineated by the block level data obtained from the Baltimore City website and then overlaid the neighborhood data obtained from the Maryland Department of Planning. In addition, using information from the Baltimore gisdata website, I was able to find out how “block” was actually defined. All of this points to the possibility of using the blocks from our dataset as street blocks.